Projection predictive variable selection for Bayesian regularized SEM
Context: Regularized SEM, i.e., models with many parameters and a penalty function (frequentist) or shrinkage prior (Bayesian).
Goal: Providing a more formal approach to selecting parameters (and thus models) in Bayesian regularized SEM.
MIMIC model drawn with https://semdiag.psychstat.org
A shrinkage prior takes the role of the penalty:
\[ \text{posterior} \propto \text{likelihood} \times \text{prior} \]
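To make the role of the prior concrete, here is a minimal base R sketch that evaluates likelihood times prior on a grid for a single coefficient. The estimate, standard error, and prior scale are made-up numbers, and a simple normal (ridge-type) prior stands in for the shrinkage priors used in practice.

```r
# Grid approximation of posterior ∝ likelihood × prior for one coefficient.
# All numbers below are hypothetical and chosen only for illustration.
beta_grid <- seq(-2, 2, length.out = 2001)
beta_hat  <- 0.30   # hypothetical estimate of the coefficient
se        <- 0.15   # hypothetical standard error

# Normal likelihood of the estimate for each candidate value of beta
likelihood <- dnorm(beta_hat, mean = beta_grid, sd = se)

# Two priors: flat (no shrinkage) and a normal prior centered at zero
prior_flat   <- rep(1, length(beta_grid))
prior_shrink <- dnorm(beta_grid, mean = 0, sd = 0.10)

# Multiply and normalize so the grid weights sum to one
normalize   <- function(d) d / sum(d)
post_flat   <- normalize(likelihood * prior_flat)
post_shrink <- normalize(likelihood * prior_shrink)

# Posterior means under each prior
post_mean_flat   <- sum(beta_grid * post_flat)
post_mean_shrink <- sum(beta_grid * post_shrink)
```

Under the flat prior the posterior mean stays at the estimate, while under the zero-centered prior it is pulled toward zero but does not reach it.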
Ideal shrinkage prior:
Many different shrinkage priors exist (see, e.g., Van Erp, Oberski, and Mulder, 2019).
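In Bayesian regularized SEM, parameters are not automatically set to zero: the shrinkage prior pulls small effects toward zero, but their posterior estimates are never exactly zero, so a separate selection step is needed.
Goal: Finding a smaller submodel that predicts practically as well as the larger reference model.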
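The idea of projecting a reference model onto smaller submodels can be sketched in base R. This is a simplified illustration under strong assumptions, not the projpred implementation: the data are simulated, an ordinary least-squares fit stands in for the Bayesian reference model, and only its fitted values (rather than posterior draws) are projected. For a Gaussian model, projecting onto a submodel then amounts to a least-squares fit of the reference predictions on the submodel's predictors.

```r
set.seed(1)

# Simulated data (hypothetical): only x1 and x2 have nonzero effects
n <- 200
X <- matrix(rnorm(n * 5), n, 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- 0.8 * X[, "x1"] + 0.4 * X[, "x2"] + rnorm(n)
dat <- data.frame(y = y, X)

# Stand-in reference model: least squares on all predictors
ref_fit <- lm(y ~ ., data = dat)
mu_ref  <- fitted(ref_fit)

# Projection loss of a submodel: mean squared distance between the
# reference predictions and their least-squares projection onto the
# submodel's predictors (plus intercept)
proj_loss <- function(vars) {
  Xs     <- cbind(1, X[, vars, drop = FALSE])
  mu_sub <- Xs %*% qr.solve(Xs, mu_ref)
  mean((mu_ref - mu_sub)^2)
}

# Forward search: at each step add the predictor that brings the
# submodel's predictions closest to those of the reference model
selected  <- character(0)
path      <- numeric(0)
remaining <- colnames(X)
while (length(remaining) > 0) {
  losses    <- sapply(remaining, function(v) proj_loss(c(selected, v)))
  best      <- names(which.min(losses))
  selected  <- c(selected, best)
  remaining <- setdiff(remaining, best)
  path      <- c(path, min(losses))
}
names(path) <- selected

print(round(path, 4))
```

The printed path shows how quickly the submodel predictions approach those of the reference model as predictors are added; the loss reaches zero once all predictors are included. A small submodel would be chosen at the point where adding further predictors no longer changes the predictions in a practically relevant way.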
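library(lavaan)
library(brms)
library(projpred)

# df: data frame containing the indicators y1-y5 and the covariates x1-x10

# MIMIC model: one factor F measured by y1-y5 and regressed on x1-x10
mod <- 'F =~ y1 + y2 + y3 + y4 + y5
        F ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10'
fit.lavaan <- sem(mod, data = df)

# Bartlett factor scores for F, used as the outcome of the reference model
fs <- lavPredict(fit.lavaan, method = "Bartlett")
df$fs <- as.vector(fs)

# prior_hs is not defined on the slide; a horseshoe prior on the
# regression coefficients is assumed here
prior_hs <- prior(horseshoe(1), class = "b")

# Reference model: regress the factor scores on all covariates
refm_fit <- brm(fs ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10,
                data = df,
                prior = prior_hs)

# Projection predictive variable selection with 10-fold cross-validation
refm_obj <- get_refmodel(refm_fit)
cvvs <- cv_varsel(
  refm_obj,
  cv_method = "kfold",
  K = 10
)

# Predictive performance of the submodels along the search path
plot(cvvs)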
When would we expect projpredSEM to be beneficial?
Note: projpredSEM is much slower than traditional criteria.
Feel free to reach out during this conference, or via e-mail: s.j.vanerp@uu.nl.
Sara van Erp